Load required package:
library(marmap)
Import bathymetry data from NOAA:
b = getNOAA.bathy(lon1 = 115, lon2 = 140, lat1 = 35, lat2 = 20, resolution = 1)
Import dataframe with the latitude and longitude of sampling points from CTD (in decimal degrees):
pts <- read.csv("Mirai_CTD_lat_long.csv")
pts$Station<- as.character(pts$Station)
str(pts)
## 'data.frame': 14 obs. of 3 variables:
## $ Lon : num 126 128 127 126 125 ...
## $ Lat : num 26.3 25.4 25.9 26.5 25.9 ...
## $ Station: chr "2" "3" "4" "5" ...
Make dataframe with map labels and their lat/long:
Clon = c(127.5,131.25, 121.6, 121.03, 120.8, 126)
Clat= c(26, 31.93, 31.1, 23, 27.2, 22)
CLabels= c("Okinawa", "Japan", "China", "Taiwan", "East China Sea", "Philippine Sea")
Countries = cbind.data.frame(Clon, Clat, CLabels)
str(Countries)
## 'data.frame': 6 obs. of 3 variables:
## $ Clon : num 128 131 122 121 121 ...
## $ Clat : num 26 31.9 31.1 23 27.2 ...
## $ CLabels: Factor w/ 6 levels "China","East China Sea",..: 4 3 1 6 2 5
Make a dataframe with vent sites:
vents <- read.csv("VentSites.csv")
str(vents)
## 'data.frame': 25 obs. of 3 variables:
## $ Long: num 122 122 123 123 123 ...
## $ Lat : num 24.8 24.8 25.1 24.8 24.9 ...
## $ Name: Factor w/ 25 levels "Akuseki-jima",..: 11 12 21 24 4 5 20 22 23 8 ...
Make plot one layer at a time:
#tiff("MiraiFiltersMap.tiff", height= 8, width =8, units = 'in', res=300) # uncomment to save plot to file
plot(b, image = TRUE, land = TRUE, xlim=c(120,135), ylim=c(22, 32), lwd = 0.03, bpal = list(c(0, max(b), grey(.7), grey(.9), grey(.95)),
c(min(b), 0, "darkblue", "lightblue")))
#plot(b, image = TRUE, land = TRUE, xlim=c(120,135), ylim=c(22, 32), lwd = 0.03, bpal = Newblues(100), cex.lab=1.5, cex.axis=1.5, cex.main=1.5, cex.sub=1.5)
#bathymetry colors
plot(b, lwd = 0.8, deep = 0, shallow = 0, step = 0, add = TRUE) # highlight coastline with solid black line
plot(b, deep=-200, shallow=-200, lwd = 0.4, drawlabels=T, add=T) # Add -200m isobath
plot(b, deep=-1000, shallow=-1000, lwd = 0.4, drawlabels=T, add=T) # Add -1000m isobath
plot(b, deep=-2000, shallow=-2000, lwd = 0.4, drawlabels=T, add=T) # Add -2000m isobath
plot(b, deep=-3000, shallow=-3000, lwd = 0.4, drawlabels=T, add=T) # Add -3000m isobath
points(pts, pch = 16, col = 'red') # Add sampling points
text(pts, labels=pts$Station, adj= 1.5, cex=1.2, size = 4) # Add sampling station numbers
text(Countries, labels=Countries$CLabels, cex= 1.7, adj = -0.2) # Add map labels
points(vents, pch = 17, col = 'blue', size = 3)
#dev.off() #uncomment to save plot to file
Sample collection, DNA extraction, PCR, & sequencing
Seawater collected from the subsurface chlorophyll maximum (SCM), mid-water column, and near-bottom was collected in 10 L Niskin bottles at each sampling station and surface seawater was collected by bucket. Five Liter replicates from each depth (4.5 from surface) were sequentially filtered through 10.0 and 0.2 um pore-size Polytetrafluoroethylene (PTFE) filters. Filters were flash frozen in liquid nitrogen and stored at -80 C until DNA extraction. DNA was extracted following the manufacturers protocols for the Qiagen/MoBio DNeasy PowerWater Kit, including the optional heating step to lyse cells. Amplicon PCR and sequencing library preparation followed the procedures described in the Illumina 16S Metagenomic Sequencing Library Preparation manual modified to include universal eukaryotic primers for the v4 region of the 18S rRNA gene (F: Stoeck et al. 2010, R: Brisbin et al. 2018) and amplicon PCR conditions most appropriate for these primers (annealing at 58C). The 220 samples were separated into 4 sequencing runs on the Illumina MiSeq sequencing platform with v3 chemistry for 2x300-bp sequencing.
Bioinformatic processing with QIIME 2
Demultiplexed paired-end sequences provided by the OIST sequencing center were imported to QIIME 2 (v2018.11) software (Bolyen et al. 2019). The Divisive Amplicon Denoising Algorithm (DADA) was implemented with the DADA2 plug-in for Qiime2 for quality filtering, chimera removal, and feature table construction (Callahan et al. 2015) separately for each sequencing run before merging the four resulting feature tables. Taxonomy was assigned to Amplicon Sequence Variants (ASVs) in the feature table by training a classifier on the Protist Ribosomal Reference (PR2) database to classify ASVs (Guillo et al. 2012). The feature table and assigned taxonomy were then exported from Qiime2 for further analysis.
Below is an example Qiime2 workflow for 1 sequencing run:
Activate an anaconda environment for qiime2: source activate qiime2-2018.11
Import sequencing data (all sequences for eacg sequencing run must be in their own directory):
qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path ./seqs/seqrun1/ --input-format CasavaOneEightSingleLanePerSampleDirFmt --output-path run1_demux-paired-end.qza
Visualize squencing results (use these graphs to decide what length to trim sequences)
qiime demux summarize --i-data run1_demux-paired-end.qza --o-visualization run1_demux.qzv
Run the DADA2 algorithm:
qiime dada2 denoise-paired --i-demultiplexed-seqs run1_demux-paired-end.qza --output-dir ./dd2run1 --o-representative-sequences run1_rep-seqs --p-trim-left-f 10 --p-trim-left-r 10 --p-trunc-len-f 260 --p-trunc-len-r 250 --p-n-threads 3
Check results:
qiime feature-table summarize --i-table ./dd2run1/table.qza --o-visualization ./dd2run1/table.qzv --m-sample-metadata-file ~/desktop/MiraiFilters/sampleMaptxt.txt
qiime metadata tabulate --m-input-file dd2run1/denoising_stats.qza --o-visualization dd2run1/denoising_stats.qzv
Merge feature tables from multiple sequencing runs:
qiime feature-table merge --i-tables ./dd2run1/table.qza --i-tables ./dd2run2/table.qza --i-tables ./dd2run3/table.qza --i-tables ./dd2run4/table.qza --o-merged-table mergedtable.qza
qiime feature-table merge-seqs --i-data run1_rep-seqs.qza --i-data run2_rep-seqs.qza --i-data run3_rep-seqs.qza --i-data run4_rep-seqs.qza --o-merged-data merged-rep-seqs.qza
Import taxonomy database:
qiime tools import --type 'FeatureData[Sequence]' --input-path pr2_version_4.11.1_mothur.fasta --output-path pr2.qza
qiime tools import --type 'FeatureData[Taxonomy]' --input-format HeaderlessTSVTaxonomyFormat --input-path pr2_version_4.11.1_mothur.tax -output-path pr2_tax.qza
Select V4 region from PR2 sequences:
qiime feature-classifier extract-reads --i-sequences pr2.qza --p-f-primer CCAGCASCYGCGGTAATTCC --p-r-primer ACTTTCGTTCTTGATYRA --o-reads pr2_v4.qza
Train the classifier:
qiime feature-classifier fit-classifier-naive-bayes --i-reference-reads pr2_v4.qza --i-reference-taxonomy pr2_tax.qza --o-classifier pr2_v4_classifier.qza
Classify:
qiime feature-classifier classify-sklearn --i-classifier pr2_v4_classifier.qza --i-reads merged-rep-seqs.qza --o-classification taxonomy.qza
qiime metadata tabulate --m-input-file taxonomy.qza --o-visualization taxonomy.qzv
Export the .tsv taxonomy file from taxonomy.qzv for use in following steps.
Load packages:
library(tidyverse)
library(qiime2R)
library('phyloseq')
library('vegan')
library('ggplot2')
library(RColorBrewer)
library(DESeq2)
library(reshape)
library("wesanderson")
library("gridExtra")
library("metagMisc")
library("wesanderson")
library("breakaway")
library("CoDaSeq")
library("ggbiplot")
library(ggpubr)
Set default colors:
ap<- c("#F1B6A1", "#D4A52A", "#E3E5DB", "#A5CFCC", "#0E899F", "#A83860", "#ED91BC", "#DB5339", "#F58851", "#42465C", "#1E479A", "#F7CDA4", "#CF529C", "#11638C")
Convert QIIME 2 artifacts to phyloseq objects:
phyloseq<-qza_to_phyloseq(features="mergedtable.qza")
Import sample data:
metatable <- read.csv("sampleMap.csv")
row.names(metatable) <- metatable[[1]]
metatable$Station <- as.character(metatable$Station)
metatable$filter <- as.character(metatable$filter)
metatable$VentANAHTM<-as.factor(metatable$VentANAHTM)
META<- sample_data(metatable)
str(META)
## 'data.frame': 219 obs. of 11 variables:
## Formal class 'sample_data' [package "phyloseq"] with 4 slots
## ..@ .Data :List of 11
## .. ..$ : Factor w/ 219 levels "A6-F173-DNA",..: 3 6 9 12 96 97 115 118 129 134 ...
## .. ..$ : chr "2" "2" "2" "2" ...
## .. ..$ : Factor w/ 4 levels "Bottom","Mid",..: 4 4 4 4 1 1 2 2 3 3 ...
## .. ..$ : int 0 0 0 0 1000 1000 700 700 92 92 ...
## .. ..$ : chr "10" "2" "10" "2" ...
## .. ..$ : Factor w/ 2 levels "n","y": 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ : Factor w/ 4 levels "KG","MOT","NOT",..: 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ : Factor w/ 3 levels "KG","NOT","SOT": 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ : Factor w/ 3 levels "n","p","y": 3 3 3 3 3 3 3 3 3 3 ...
## .. ..$ : Factor w/ 4 levels "KG","NOT","O",..: 1 1 1 1 1 1 1 1 1 1 ...
## .. ..$ : Factor w/ 2 levels "n","y": 2 2 2 2 2 2 2 2 2 2 ...
## ..@ names : chr "SampleID" "Station" "Depth" "m" ...
## ..@ row.names: chr "A8-F17-DNA" "B8-F18-DNA" "C8-F19-DNA" "D8-F20-DNA" ...
## ..@ .S3Class : chr "data.frame"
Load PR2 taxonomy:
taxonomy <- read.csv("taxonomy.csv", stringsAsFactors = FALSE)
names(taxonomy) <- c("row", "tax", "Confidence")
row.names(taxonomy) <-taxonomy[[1]]
taxonomy <- taxonomy[,(-1)]
taxonomy <- separate(taxonomy, tax, c("D0","D1", "D2", "D3", "D4", "D5", "D6", "D7", "D8", "D9", "D10", "D11", "D12", "D13", "D14"), sep = ";", fill = "right")
taxonomy <- taxonomy[,c(1:8)]
taxmat <- as.matrix(taxonomy)
TAX = tax_table(taxmat)
str(TAX)
## Formal class 'taxonomyTable' [package "phyloseq"] with 1 slot
## ..@ .Data: chr [1:23115, 1:8] "Eukaryota" "Eukaryota" "Eukaryota" "Eukaryota" ...
## .. ..- attr(*, "dimnames")=List of 2
## .. .. ..$ : chr [1:23115] "000a220fe8edd771dd82f6915627ef3e" "000c79760dd16d6692018deb8ea1b164" "000d9eaa21eaff08f34fb1a1a87601d5" "000e6f109e01181b2b1eda5d6c99319c" ...
## .. .. ..$ : chr [1:8] "D0" "D1" "D2" "D3" ...
Add taxonomy to phyloseq object:
ps = merge_phyloseq(phyloseq, TAX, META)
Prevalence plots display the total abundance of an ASV in the full data set v. the fraction of total samples that ASV is found in.
prevdf = apply(X = otu_table(ps),
MARGIN = ifelse(taxa_are_rows(ps), yes = 1, no = 2),
FUN = function(x){sum(x > 0)})
prevdf = data.frame(Prevalence = prevdf,
TotalAbundance = taxa_sums(ps),
tax_table(ps))
#plyr::ddply(prevdf, "D2", function(df1){cbind(mean(df1$Prevalence),sum(df1$Prevalence))})
Prevalence plot:
prevplot1<-ggplot(prevdf, aes(TotalAbundance, Prevalence / nsamples(ps),color=D2)) + geom_point(size = 2, alpha = 0.7) +
theme_bw()+
scale_x_log10() + xlab("Total Abundance") + ylab("Prevalence [Frac. Samples]") +
facet_wrap(~D1) + theme(legend.position="none")
prevplot1
We don’t apply prevalence filter because we are interested in rare ASVs in deep water masses and at hydrothermal vents.
OTUs <- data.frame(otu_table(ps))
Total number of ASVs in the data set:
OTUsRS<- OTUs
OTUsRS$RowSum <- rowSums(OTUsRS)
OTUsRSnoZero <- OTUsRS$RowSum!=0
sum(OTUsRSnoZero)
## [1] 22656
Total number of ASVs per sample (range and mean):
OTUs0 <- OTUs!=0 #is this number not a zero? true (1) or false (0)
csums <- colSums(OTUs0) # col sums = observed ASV richness
csumdf <- as.data.frame(csums)
max(csumdf$csums) #1906
## [1] 1906
min(csumdf$csums) #49
## [1] 49
mean(csumdf$csums) #730
## [1] 729.9194
Breakaway Modeled Richness for all samples by depth layer:
ba <- breakaway(ps)
badf<- summary(ba) %>%
add_column("SampleID" = ps %>% otu_table %>% sample_names)
badf<- merge(badf, metatable, by = "SampleID")
baPlot <- ggplot(badf, aes(x=Depth, y=estimate, fill=filter)) + geom_boxplot() + theme_bw() + scale_fill_manual(values = c("#E3E5DB", "#11638C"), labels=c("10.0", "0.2")) + theme(text = element_text(size=14)) +ylab("Richness Estimate")
baPlot
ggsave("breakaway_4depths.tiff", width = 6, height = 4)
Breakaway alpha diversity significance testing for all samples by depth layer:
bt <- betta(summary(ba)$estimate,
summary(ba)$error,
make_design_matrix(ps, "Depth"))
bt$table
## Estimates Standard Errors p-values
## (Intercept) 645.865450 19.47771 0.000
## predictorsMid -31.766181 39.48228 0.421
## predictorsSCM 379.222118 39.56365 0.000
## predictorsSurface -1.443341 38.09437 0.970
The modeled richness in the SCM is significantly higher than all other sample types, which are not significantly different from each other.
Breakaway Modeled Richness for all bottom water samples by sampling station:
bottomraw<- subset_samples(ps, Depth=="Bottom")
bba <- breakaway(bottomraw)
bbadf<- summary(bba) %>%
add_column("SampleID" = bottomraw %>% otu_table %>% sample_names)
bbadf<- merge(bbadf, metatable, by = "SampleID")
bbadf$Station_f <- factor(bbadf$Station, levels=c("11", "10", "9", "8", "3", "4", "2", "5", "12", "13", "14", "15", "17", "18"))
bbaPlot <- ggplot(bbadf, aes(x=Station_f, y=estimate, fill=filter)) + geom_boxplot() + theme_bw() + scale_fill_manual(values = c("#E3E5DB", "#11638C"), labels=c("10.0", "0.2")) + theme(text = element_text(size=14)) +ylab("Richness Estimate") +xlab("Station")
bbaPlot
ggsave("breakaway_bottom.tiff", width = 6, height = 4)
Breakaway alpha diversity significance testing for bottom samples by vent (stations 2, 10), plume (station 11), or no hydrothermal activity (all other stations):
bt <- betta(summary(bba)$estimate,
summary(bba)$error,
make_design_matrix(bottomraw, "VentPlume"))
bt$table
## Estimates Standard Errors p-values
## (Intercept) 633.6996 28.73865 0.000
## predictorsp -112.9082 104.57203 0.280
## predictorsy 137.8603 73.94473 0.062
Vent/Plume sites are not more or less diverse than sites without hydrothermal influence.
#table(tax_table(psN)[, "D1"], exclude = NULL)
ps<-subset_taxa(ps, D1 != "")
ps<-subset_taxa(ps, D1 != "NA")
ps<-subset_taxa(ps, D2 != "Metazoa")
OTUs <- data.frame(otu_table(ps))
Check ASV richness to see how much it changes after filtering:
OTUs0 <- OTUs!=0
csums <- colSums(OTUs0)
csumdf <- as.data.frame(csums)
max(csumdf$csums) #1800
## [1] 1800
min(csumdf$csums) #45
## [1] 45
mean(csumdf$csums) #685
## [1] 684.3128
OTU4clr<- data.frame(t(data.frame(otu_table(ps))))
row.names(OTU4clr) <- gsub("\\.", "", row.names(OTU4clr))
OTUs.clr <- codaSeq.clr(OTU4clr + 0.5, samples.by.row=TRUE)
OTU2 <- otu_table(as.matrix(OTUs.clr), taxa_are_rows = FALSE)
row.names(metatable) <- gsub("-", "", row.names(metatable))
META2<- sample_data(metatable)
psCLR <- phyloseq(OTU2,TAX,META2)
ordu = ordinate(psCLR, "PCoA", "euclidean")
p<-plot_ordination(psCLR, ordu, color="Depth", shape="filter")+theme_bw() +scale_color_manual(values=c(ap[8], ap[12], ap[5], ap[4])) + scale_shape_discrete(labels=c("10.0", "0.2")) +geom_point(size=3) +theme(text = element_text(size=16))
p
#ggsave("pcoa_allsamples.tiff", width = 6, height = 4)
The strongest determinant of clustering seems to be depth. To better observe clustering within depth groups, subset by depth and then filter type.
physeqSurf<- subset_samples(psCLR, Depth == "Surface")
ordu = ordinate(physeqSurf, "PCoA", "euclidean")
pSurf<-plot_ordination(physeqSurf, ordu, color="Depth", shape="filter")+theme_bw() +scale_color_manual(values= c(ap[4]))+ geom_point(size=3) +ggtitle("Surface") +theme(text = element_text(size=16)) + guides(color = FALSE) + scale_shape_discrete(labels=c("10.0", "0.2"))
pSurf
Samples cluster by filter type.
PCA
physeqSurf<- subset_samples(psCLR, Depth == "Surface" & filter == "10")
OTUsSurf10 <- data.frame(otu_table(physeqSurf))
meta <- metatable[row.names(metatable) %in% row.names(OTUsSurf10),]
Surface_10<- prcomp(OTUsSurf10)
plot(Surface_10, type="lines")
PCA_Surf10<-ggbiplot(Surface_10, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE) +theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2]))+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("Surface 10.0")
PCA_Surf10
PERMANOVA
set.seed(1)
adonis(vegdist(OTUsSurf10, method = "euclidean") ~ Region_SKN, data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsSurf10, method = "euclidean") ~ Region_SKN, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 5544 2772.1 1.7612 0.1235 0.005 **
## Residuals 25 39350 1574.0 0.8765
## Total 27 44894 1.0000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Pairwise PERMANOVA
tst<-adonis_pairwise(x=meta, dd=vegdist(OTUsSurf10, method = "euclidean"), group.var="Region_SKN")
tst$Adonis.tab
## Comparison R2 F df p p.adj
## 1 KG.NOT 0.12504910 2.5725831 1;18 0.002 0.0060
## 2 KG.SOT 0.06560963 0.9830311 1;14 0.455 0.4550
## 3 NOT.SOT 0.08495903 1.6712504 1;18 0.007 0.0105
PCA
physeqSurf2<- subset_samples(psCLR, Depth == "Surface" & filter == "2")
#tiff("variance_surface2.tiff", height= 8, width =8, units = 'in', res=70) # uncomment to save plot to file
OTUsSurf2 <- data.frame(otu_table(physeqSurf2))
meta <- metatable[row.names(metatable) %in% row.names(OTUsSurf2),]
surface2 <- prcomp(OTUsSurf2)
plot(surface2, type="lines")
#dev.off()
PCA_Surf2<-ggbiplot(surface2, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE ) +theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2]))+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("Surface 0.2")
PCA_Surf2
PERMANOVA
set.seed(1)
adonis(vegdist(OTUsSurf2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsSurf2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 8964 4481.9 2.673 0.18217 0.001 ***
## Residuals 24 40241 1676.7 0.81783
## Total 26 49205 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Pairwise PERMANOVA
tst<-adonis_pairwise(x=meta, dd=vegdist(OTUsSurf2, method = "euclidean"), group.var="Region_SKN")
tst$Adonis.tab
## Comparison R2 F df p p.adj
## 1 KG.NOT 0.17432173 3.800259 1;18 0.001 0.0015
## 2 KG.SOT 0.09142048 1.308049 1;13 0.151 0.1510
## 3 NOT.SOT 0.13974110 2.761493 1;17 0.001 0.0015
physeqSCM<- subset_samples(psCLR, Depth == "SCM")
ordu = ordinate(physeqSCM, "PCoA", "euclidean")
pSCM<-plot_ordination(physeqSCM, ordu, color="Depth", shape="filter")+theme_bw() +scale_color_manual(values=c(ap[5]))+ geom_point(size=3) +ggtitle("SCM") +theme(text = element_text(size=16))
pSCM
Again, strong clustering by filter type.
PCA
physeqSCM10<- subset_samples(psCLR, Depth == "SCM" & filter =="10")
OTUsSCM10 <- data.frame(otu_table(physeqSCM10))
meta <- metatable[row.names(metatable) %in% row.names(OTUsSCM10),]
SCM10 <- prcomp(OTUsSCM10)
plot(SCM10, type="lines")
PCA_SCM10<-ggbiplot(SCM10, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE ) +theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2]))+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("SCM 10.0")
PCA_SCM10
PERMANOVA
set.seed(1)
adonis(vegdist(OTUsSCM10, method = "euclidean") ~ Region_SKN, data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsSCM10, method = "euclidean") ~ Region_SKN, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 10012 5006.2 1.4675 0.10897 0.011 *
## Residuals 24 81871 3411.3 0.89103
## Total 26 91883 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
pairwise PERMANOVA
tst<-adonis_pairwise(x=meta, dd=vegdist(OTUsSCM10, method = "euclidean"), group.var="Region_SKN")
tst$Adonis.tab
## Comparison R2 F df p p.adj
## 1 KG.NOT 0.09150520 1.712270 1;17 0.010 0.0285
## 2 KG.SOT 0.08412546 1.194084 1;13 0.174 0.1740
## 3 NOT.SOT 0.07616598 1.484019 1;18 0.019 0.0285
PCA
physeqSCM2<- subset_samples(psCLR, Depth == "SCM" & filter == "2")
OTUsSCM2 <- data.frame(otu_table(physeqSCM2))
meta <- metatable[row.names(metatable) %in% row.names(OTUsSCM2),]
SCM2 <- prcomp(OTUsSCM2)
plot(SCM2, type="lines")
PCA_SCM2<-ggbiplot(SCM2, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE ) +theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2]))+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("SCM 0.2")
PCA_SCM2
PERMANOVA
set.seed(1)
adonis(vegdist(OTUsSCM2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsSCM2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 6674 3336.8 1.1823 0.09705 0.153
## Residuals 22 62094 2822.4 0.90295
## Total 24 68768 1.00000
physeqMid<- subset_samples(psCLR, Depth == "Mid")
ordu = ordinate(physeqMid, "PCoA", "euclidean")
pMid<-plot_ordination(physeqMid, ordu, color="Depth", shape = "filter")+theme_bw() +scale_color_manual(values=ap[12])+ geom_point(size=3) +ggtitle("Mid")+theme(text = element_text(size=16))
pMid
Samples still cluster by filter pore-size, but now on the secondary axis.
PCA
physeqMid10<- subset_samples(psCLR, Depth == "Mid" & filter == "10")
OTUsMid10 <- data.frame(otu_table(physeqMid10))
meta <- metatable[row.names(metatable) %in% row.names(OTUsMid10),]
Mid10 <- prcomp(OTUsMid10)
plot(Mid10, type="lines")
PCA_Mid10<-ggbiplot(Mid10, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE ) +theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2]))+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("Mid 10.0")
PCA_Mid10
PERMANOVA by Region
adonis(vegdist(OTUsMid10, method = "euclidean") ~ Region_SKN, data = meta, permutations = 999)
##
## Call:
## adonis(formula = vegdist(OTUsMid10, method = "euclidean") ~ Region_SKN, data = meta, permutations = 999)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 4931 2465.6 1.2669 0.09923 0.075 .
## Residuals 23 44763 1946.2 0.90077
## Total 25 49694 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
not significant at our cut-off
PCA
physeqMid2<- subset_samples(psCLR, Depth == "Mid" & filter == "2")
OTUsMid2 <- data.frame(otu_table(physeqMid2))
meta <- metatable[row.names(metatable) %in% row.names(OTUsMid2),]
Mid2 <- prcomp(OTUsMid2)
plot(Mid2, type="lines")
PCA_Mid2<-ggbiplot(Mid2, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE ) +theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2]))+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("Mid 0.2")
PCA_Mid2
#ggsave("PCA_Mid2.png")
PERMANOVA by Region
set.seed(1)
adonis(vegdist(OTUsMid2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsMid2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 6883 3441.3 1.6878 0.13302 0.001 ***
## Residuals 22 44857 2038.9 0.86698
## Total 24 51739 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
pairwise PERMANOVA
tst<-adonis_pairwise(x=meta, dd=vegdist(OTUsMid2, method = "euclidean"), group.var="Region_SKN")
tst$Adonis.tab
## Comparison R2 F df p p.adj
## 1 KG.NOT 0.12483288 2.139583 1;15 0.009 0.0270
## 2 KG.SOT 0.11303007 1.401773 1;11 0.062 0.0620
## 3 NOT.SOT 0.07811164 1.525141 1;18 0.019 0.0285
physeqBottom<- subset_samples(psCLR, Depth == "Bottom")
ordu = ordinate(physeqBottom, "PCoA", "euclidean")
pBottom<-plot_ordination(physeqBottom, ordu, color="Depth", shape = "filter")+theme_bw() +scale_color_manual(values=ap[8])+ geom_point(size=3) +ggtitle("Bottom")+theme(text = element_text(size=16))
pBottom
No clustering by filter type.
For consistency, split by filter type for the NOT/SOT/KG biogeography:
PCA
physeqBottom10<- subset_samples(psCLR, Depth == "Bottom" & filter == "10")
OTUsBottom10 <- data.frame(otu_table(physeqBottom10))
meta <- metatable[row.names(metatable) %in% row.names(OTUsBottom10),]
Bottom10 <- prcomp(OTUsBottom10)
plot(Bottom10, type="lines")
PCA_Bottom10<-ggbiplot(Bottom10, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE ) +theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2]))+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("Bottom 10.0")
PCA_Bottom10
PERMANOVA by Region
set.seed(1)
adonis(vegdist(OTUsBottom10, method = "euclidean") ~ Region_SKN , data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsBottom10, method = "euclidean") ~ Region_SKN, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 5815 2907.4 1.2317 0.09309 0.068 .
## Residuals 24 56651 2360.4 0.90691
## Total 26 62465 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
not significant at our cut-off
PCA
physeqBottom2<- subset_samples(psCLR, Depth == "Bottom" & filter == "2")
OTUsBottom2 <- data.frame(otu_table(physeqBottom2))
meta <- metatable[row.names(metatable) %in% row.names(OTUsBottom2),]
Bottom2 <- prcomp(OTUsBottom2)
plot(Bottom2, type="lines")
PCA_Bottom2<-ggbiplot(Bottom2, var.axes = FALSE, groups= meta$Region_SKN, ellipse=TRUE ) +theme_bw() +scale_color_manual(values=c(ap[4], ap[11], ap[2]))+ theme(text = element_text(size=14)) + theme(legend.position="bottom") +ggtitle("Bottom 0.2")
PCA_Bottom2
PERMANOVA by region
set.seed(1)
adonis(vegdist(OTUsBottom2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsBottom2, method = "euclidean") ~ Region_SKN, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## Region_SKN 2 7125 3562.5 1.5119 0.11619 0.007 **
## Residuals 23 54197 2356.4 0.88381
## Total 25 61322 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Pairwise PERMANOVA
tst<-adonis_pairwise(x=meta, dd=vegdist(OTUsBottom2, method = "euclidean"), group.var="Region_SKN")
tst$Adonis.tab
## Comparison R2 F df p p.adj
## 1 KG.NOT 0.06908014 1.187301 1;16 0.165 0.165
## 2 KG.SOT 0.09775585 1.408517 1;13 0.030 0.045
## 3 NOT.SOT 0.10567637 2.008779 1;17 0.001 0.003
Pattern reversed from that seen in Mid, SCM, and Surface
Supplemental Figure 1
supp1<-ggarrange(pSurf, pSCM, pMid, pBottom, common.legend = TRUE, legend = "bottom")
supp1
#ggsave("Supplemental1.png", width = 6, height = 6 )
Bottom-water by Hydrothermal Activity
For Vent v. Plume - use 10.0 and 0.2 samples together because 0.2 and 10.0 pore-size filters do not cluster separately in the PCoA and necessary because there are a low number of vent sites compared to non-vent sites
PERMANOVA
set.seed(1)
physeqBottom<- subset_samples(psCLR, Depth == "Bottom")
OTUsBottom <- data.frame(otu_table(physeqBottom))
meta <- metatable[row.names(metatable) %in% row.names(OTUsBottom),]
adonis(vegdist(OTUsBottom, method = "euclidean") ~ VentPlume, data = meta)
##
## Call:
## adonis(formula = vegdist(OTUsBottom, method = "euclidean") ~ VentPlume, data = meta)
##
## Permutation: free
## Number of permutations: 999
##
## Terms added sequentially (first to last)
##
## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
## VentPlume 2 9303 4651.3 1.9195 0.0713 0.001 ***
## Residuals 50 121160 2423.2 0.9287
## Total 52 130462 1.0000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
pairwise PERMANOVA
tst<-adonis_pairwise(x=meta, dd=vegdist(OTUsBottom, method = "euclidean"), group.var="VentPlume")
tst$Adonis.tab
## Comparison R2 F df p p.adj
## 1 n.p 0.03681976 1.6437730 1;43 0.012 0.018
## 2 n.y 0.04858759 2.4002384 1;47 0.001 0.003
## 3 p.y 0.08956956 0.9838155 1;10 0.429 0.429
PCA
Bottom <- prcomp(OTUsBottom)
plot(Bottom, type="lines")
PCA_Bottom<-ggbiplot(Bottom, var.axes = FALSE, groups= meta$VentPlume, ellipse=TRUE ) +theme_bw() +scale_color_manual(values= c(wes_palette("Cavalcanti1")[4], wes_palette("Cavalcanti1")[1], wes_palette("Cavalcanti1")[5] ), labels= c("none", "plume", "vent"))+ theme(text = element_text(size=16)) + theme(legend.position="bottom")
PCA_Bottom
Merge Samples –> collapse replicates (for Relative Abundance Plots)
Prepare metadata table:
metatable <- read.csv("sampleMap.csv")
row.names(metatable) <- metatable[[1]]
metatable$desc <- paste(metatable$Station, metatable$Depth, metatable$filter, sep="-")
metatable$Depth <- as.character(metatable$Depth)
metatable$Station <-as.character(metatable$Station)
metatable$filter<-as.character(metatable$filter)
metatable<- metatable[,-which(names(metatable) == "SampleID")]
META<- sample_data(metatable)
ps2<- ps
sample_data(ps2) <- META
print(ps2)
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 19574 taxa and 211 samples ]
## sample_data() Sample Data: [ 211 samples by 11 sample variables ]
## tax_table() Taxonomy Table: [ 19574 taxa by 8 taxonomic ranks ]
str(sample_data(ps2))
## 'data.frame': 211 obs. of 11 variables:
## Formal class 'sample_data' [package "phyloseq"] with 4 slots
## ..@ .Data :List of 11
## .. ..$ : chr "11" "10" "10" "10" ...
## .. ..$ : chr "Mid" "Bottom" "Mid" "SCM" ...
## .. ..$ : int 700 1510 700 85 85 85 700 700 1510 1510 ...
## .. ..$ : chr "2" "2" "10" "2" ...
## .. ..$ : Factor w/ 2 levels "n","y": 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ : Factor w/ 4 levels "KG","MOT","NOT",..: 4 4 4 4 4 4 4 4 4 4 ...
## .. ..$ : Factor w/ 3 levels "KG","NOT","SOT": 3 3 3 3 3 3 3 3 3 3 ...
## .. ..$ : Factor w/ 3 levels "n","p","y": 2 3 3 3 3 3 3 3 3 3 ...
## .. ..$ : Factor w/ 4 levels "KG","NOT","O",..: 4 4 4 4 4 4 4 4 4 4 ...
## .. ..$ : Factor w/ 2 levels "n","y": 1 2 2 2 2 2 2 2 2 2 ...
## .. ..$ : chr "11-Mid-2" "10-Bottom-2" "10-Mid-10" "10-SCM-2" ...
## ..@ names : chr "Station" "Depth" "m" "filter" ...
## ..@ row.names: chr "F142" "F150" "F151" "F154" ...
## ..@ .S3Class : chr "data.frame"
Merge:
mergedps <- merge_samples(ps2, "desc")
print(mergedps)
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 19574 taxa and 112 samples ]
## sample_data() Sample Data: [ 112 samples by 11 sample variables ]
## tax_table() Taxonomy Table: [ 19574 taxa by 8 taxonomic ranks ]
number of samples reduced from 211 to 112
sample_names(mergedps)
## [1] "10-Bottom-10" "10-Bottom-2" "10-Mid-10" "10-Mid-2"
## [5] "10-SCM-10" "10-SCM-2" "10-Surface-10" "10-Surface-2"
## [9] "11-Bottom-10" "11-Bottom-2" "11-Mid-10" "11-Mid-2"
## [13] "11-SCM-10" "11-SCM-2" "11-Surface-10" "11-Surface-2"
## [17] "12-Bottom-10" "12-Bottom-2" "12-Mid-10" "12-Mid-2"
## [21] "12-SCM-10" "12-SCM-2" "12-Surface-10" "12-Surface-2"
## [25] "13-Bottom-10" "13-Bottom-2" "13-Mid-10" "13-Mid-2"
## [29] "13-SCM-10" "13-SCM-2" "13-Surface-10" "13-Surface-2"
## [33] "14-Bottom-10" "14-Bottom-2" "14-Mid-10" "14-Mid-2"
## [37] "14-SCM-10" "14-SCM-2" "14-Surface-10" "14-Surface-2"
## [41] "15-Bottom-10" "15-Bottom-2" "15-Mid-10" "15-Mid-2"
## [45] "15-SCM-10" "15-SCM-2" "15-Surface-10" "15-Surface-2"
## [49] "17-Bottom-10" "17-Bottom-2" "17-Mid-10" "17-Mid-2"
## [53] "17-SCM-10" "17-SCM-2" "17-Surface-10" "17-Surface-2"
## [57] "18-Bottom-10" "18-Bottom-2" "18-Mid-10" "18-Mid-2"
## [61] "18-SCM-10" "18-SCM-2" "18-Surface-10" "18-Surface-2"
## [65] "2-Bottom-10" "2-Bottom-2" "2-Mid-10" "2-Mid-2"
## [69] "2-SCM-10" "2-SCM-2" "2-Surface-10" "2-Surface-2"
## [73] "3-Bottom-10" "3-Bottom-2" "3-Mid-10" "3-Mid-2"
## [77] "3-SCM-10" "3-SCM-2" "3-Surface-10" "3-Surface-2"
## [81] "4-Bottom-10" "4-Bottom-2" "4-Mid-10" "4-Mid-2"
## [85] "4-SCM-10" "4-SCM-2" "4-Surface-10" "4-Surface-2"
## [89] "5-Bottom-10" "5-Bottom-2" "5-Mid-10" "5-Mid-2"
## [93] "5-SCM-10" "5-SCM-2" "5-Surface-10" "5-Surface-2"
## [97] "8-Bottom-10" "8-Bottom-2" "8-Mid-10" "8-Mid-2"
## [101] "8-SCM-10" "8-SCM-2" "8-Surface-10" "8-Surface-2"
## [105] "9-Bottom-10" "9-Bottom-2" "9-Mid-10" "9-Mid-2"
## [109] "9-SCM-10" "9-SCM-2" "9-Surface-10" "9-Surface-2"
But metatable inside phyloseq object was disturbed …
str(sample_data(mergedps))
## 'data.frame': 112 obs. of 11 variables:
## Formal class 'sample_data' [package "phyloseq"] with 4 slots
## ..@ .Data :List of 11
## .. ..$ : num 10 10 10 10 10 10 10 10 11 11 ...
## .. ..$ : num NA NA NA NA NA NA NA NA NA NA ...
## .. ..$ : num 1510 1510 700 700 85 85 0 0 1210 1210 ...
## .. ..$ : num 10 2 10 2 10 2 10 2 10 2 ...
## .. ..$ : num 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ : num 4 4 4 4 4 4 4 4 4 4 ...
## .. ..$ : num 3 3 3 3 3 3 3 3 3 3 ...
## .. ..$ : num 3 3 3 3 3 3 3 3 2 2 ...
## .. ..$ : num 4 4 4 4 4 4 4 4 4 4 ...
## .. ..$ : num 2 2 2 2 2 2 2 2 1 1 ...
## .. ..$ : num NA NA NA NA NA NA NA NA NA NA ...
## ..@ names : chr "Station" "Depth" "m" "filter" ...
## ..@ row.names: chr "10-Bottom-10" "10-Bottom-2" "10-Mid-10" "10-Mid-2" ...
## ..@ .S3Class : chr "data.frame"
Fix meta table in phyloseq object:
meta2<-as.data.frame(sample_data(mergedps))
split<- do.call(rbind, strsplit(row.names(meta2), "-"))
meta2$Depth <- split[,2]
meta2$filter<-split[,3]
meta2$desc <- row.names(meta2)
meta2$Station <- as.character(meta2$Station)
meta2$Depth_f <- factor(meta2$Depth, levels=c('Surface','SCM','Mid','Bottom'))
meta2$Station_f <- factor(meta2$Station, levels=c("11", "10", "9", "8", "3", "4", "2", "5", "12", "13", "14", "15", "17", "18"))
META <-sample_data(meta2)
sample_data(mergedps)<-META
Fixed!
str(sample_data(mergedps))
## 'data.frame': 112 obs. of 13 variables:
## Formal class 'sample_data' [package "phyloseq"] with 4 slots
## ..@ .Data :List of 13
## .. ..$ : chr "10" "10" "10" "10" ...
## .. ..$ : chr "Bottom" "Bottom" "Mid" "Mid" ...
## .. ..$ : num 1510 1510 700 700 85 85 0 0 1210 1210 ...
## .. ..$ : chr "10" "2" "10" "2" ...
## .. ..$ : num 2 2 2 2 2 2 2 2 2 2 ...
## .. ..$ : num 4 4 4 4 4 4 4 4 4 4 ...
## .. ..$ : num 3 3 3 3 3 3 3 3 3 3 ...
## .. ..$ : num 3 3 3 3 3 3 3 3 2 2 ...
## .. ..$ : num 4 4 4 4 4 4 4 4 4 4 ...
## .. ..$ : num 2 2 2 2 2 2 2 2 1 1 ...
## .. ..$ : chr "10-Bottom-10" "10-Bottom-2" "10-Mid-10" "10-Mid-2" ...
## .. ..$ : Factor w/ 4 levels "Surface","SCM",..: 4 4 3 3 2 2 1 1 4 4 ...
## .. ..$ : Factor w/ 14 levels "11","10","9",..: 2 2 2 2 2 2 2 2 1 1 ...
## ..@ names : chr "Station" "Depth" "m" "filter" ...
## ..@ row.names: chr "10-Bottom-10" "10-Bottom-2" "10-Mid-10" "10-Mid-2" ...
## ..@ .S3Class : chr "data.frame"
Taxa glom - for clean RA plots (so that there is a not a line in the plot for each ASV)
mergedps<- subset_taxa(mergedps, D2 != "Metazoa")
mergedps<-subset_taxa(mergedps, D1 != "")
mergedps<-subset_taxa(mergedps, D1 != "NA")
mergedps = tax_glom(mergedps, "D2")
'%ni%' <- Negate('%in%')
LowPrev<- c("Alveolata_X", "Amoebozoa_X", "Apicomplexa", "Apusomonadidae", "Conosa", "Discoba", "Eukaryota_XX", "Foraminifera", "Hilomonadea","Fungi", "Katablepharidophyta", "Lobosa", "Mesomycetozoa", "Metamonada", "Metazoa", "Opisthokonta_X", "Perkinsea", "Streptophyta", "Rhodophyta","Telonemia", "")
mergedHighPrev<- subset_taxa(mergedps, D2 %ni% LowPrev)
mergedRAhp <- transform_sample_counts(mergedHighPrev, function(OTU) 100* OTU/sum(OTU))
ap2<- c("#F7CDA4", "#DB5339", "#E3E5DB", "#A5CFCC", "#11638C", "#A83860","#D4A52A", "#CF529C", "#1E479A", "#F1B6A1", "#42465C", "#0E899F")
taxabarplot<-plot_bar(mergedRAhp, x= "Station_f", fill = "D2", facet_grid=filter~Depth_f) + scale_y_continuous(expand = c(0, 0)) + ggtitle("") + scale_fill_manual(values= ap2) + theme(legend.title=element_blank()) + geom_bar(aes(fill=D2), stat="identity", position="stack", width =0.9) +theme_classic() + theme(text = element_text(size=14))+theme(axis.text.x = element_text(angle = 90)) + xlab("Station") +ylab("Relative Abundance(%)")
taxaplot_nolegend <- taxabarplot + theme(legend.position="none")
taxaplot_nolegend
#ggsave("taxaplot_nolegend_newcolors.pdf", width = 8, height = 5)
taxabarplotD1<-plot_bar(mergedRAhp, x= "Station_f", fill = "D1", facet_grid=filter~Depth_f) + scale_y_continuous(expand = c(0, 0)) + ggtitle("") + scale_fill_manual(values=ap) + theme(legend.title=element_blank()) + geom_bar(aes(fill=D1), stat="identity", position="stack", width =0.9) +theme_classic() + theme(text = element_text(size=14))+theme(axis.text.x = element_text(angle = 90)) + xlab("Station") +ylab("Relative Abundance(%)")
taxaplotD1_nolegend <- taxabarplotD1 + theme(legend.position="none")
taxaplotD1_nolegend
BottomTaxa<-subset_samples(mergedRAhp, Depth == "Bottom")
Bottombarplot<-plot_bar(BottomTaxa, x= "Station_f", fill = "D2", facet_grid=~filter) + scale_y_continuous(expand = c(0, 0)) + ggtitle("") + scale_fill_manual(values=ap2) + theme(legend.title=element_blank()) + geom_bar(aes(fill=D2), stat="identity", position="stack", width =0.9) +theme_classic() + theme(text = element_text(size=14))+theme(axis.text.x = element_text(angle = 90)) + xlab("Station") +ylab("Relative Abundance(%)")
Bottombarplot_nolegend <- Bottombarplot+ theme(legend.position="none")
Bottombarplot_nolegend
#ggsave("Bottombarplot_nolegend.pdf", width=8, height =4)
subset rawcount phyloseq object to just hold bottom samples:
bottomraw<- subset_samples(ps, Depth=="Bottom")
First, with only station 2 and 10 considered as hydrothermal sites:
ddse <- phyloseq_to_deseq2(bottomraw, ~VentANAHTM)
## converting counts to integer mode
ddse2 <- DESeq(ddse, test="Wald", fitType="parametric")
res<- results(ddse2, cooksCutoff = FALSE)
alpha = 0.01
sigtab = res[which(res$padj < alpha), ]
dim(sigtab)
## [1] 24 6
And if station 11 is included …
ddse <- phyloseq_to_deseq2(bottomraw, ~Vent)
## converting counts to integer mode
ddse2 <- DESeq(ddse, test="Wald", fitType="parametric")
res<- results(ddse2, cooksCutoff = FALSE)
alpha = 0.01
sigtab = res[which(res$padj < alpha), ]
dim(sigtab)
## [1] 46 6
sigtab = cbind(as(sigtab, "data.frame"), as(tax_table(bottomraw)[rownames(sigtab), ], "matrix"))
sigtab = sigtab[sigtab$D2 != "Fungi",] # this fungus classification seems to be a mistake (based on original publication) and would otherwise only be classified as Eukaryota and, therefore, filtered from our data set at the very beginning of the analysis
x = tapply(sigtab$log2FoldChange, sigtab$D2, function(x) max(x))
x = sort(x, TRUE)
sigtab$D2 = factor(as.character(sigtab$D2), levels=names(x))
#D3 level
x = tapply(sigtab$log2FoldChange, sigtab$D3, function(x) max(x))
x = sort(x, TRUE)
sigtab$D3 = factor(as.character(sigtab$D3), levels=names(x))
# D4 level
x = tapply(sigtab$log2FoldChange, sigtab$D4, function(x) max(x))
x = sort(x, TRUE)
sigtab$D4 = factor(as.character(sigtab$D4), levels=names(x))
#plot F0
sigplot<- ggplot(sigtab, aes(x=D2, y=log2FoldChange, color = D3)) + geom_point(size=5, alpha = 0.7) + theme(legend.title=element_blank()) + theme_bw() +
ggtitle(" ") +
theme(axis.text.x = element_text(angle = -45, hjust = 0, vjust=0.75)) + scale_color_manual(values=ap) + theme(text = element_text(size=16))
noLegendsigplot <- sigplot +theme(legend.position = "none")
noLegendsigplot
#ggsave("sigplotnoLegend.pdf", height = 4, width = 6)
sigplot
#ggsave("sigplot.pdf")